Constructing Transaction Serialization Order for Incremental Data Warehouse Refresh
Abstract
In typical data warehouse practice, the warehouse data is kept at a site separate from the operational data. Changes to the operational data are propagated to the warehouse site periodically, usually by shipping the new log entries from the operational site. When the operational site is a parallel DBMS, the log entries must be arranged into a coherent order so that the warehouse data maintains consistency and integrity. This paper solves the problem by presenting a simple method for constructing a global transaction order from the set of local logs of a parallel or distributed system. For serial database systems, transaction order can easily be inferred from the local database log. For parallel or distributed database systems, each local log file holds only partial information about each transaction, so the problem becomes difficult. In this paper, we present a simple method that determines the order of transactions by examining only the relative positions of their vote and commit entries, without looking into which data items each transaction accesses. Based on this mechanism, we present a method that efficiently constructs a global transaction order. Conceptually, our method first connects the vote and commit symbols of the local log files into a network. It then traverses the network to detect inconsistencies in the ordering of commit symbols across different logs, and resolves them by adjusting the positions of commit symbols in the network. Our method has the following features. First, it examines only the vote and commit entries of transactions, not the data access entries. Second, for vote or commit entries, only their relative positions and their transaction IDs (which identify the transaction an entry comes from) require attention; no other information recorded in a vote or commit entry needs to be examined. Finally, it has low complexity: for a set of local logs with l entries in total, only O(l) cost is required in both time and space.
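The abstract does not spell out the ordering rule, so the following is only a rough sketch of the general idea of deriving a global order from vote and commit positions. It assumes, as an illustration and not as the paper's stated rule, that if a transaction's commit entry precedes another transaction's vote entry in the same local log, the first transaction must come earlier in the global order. The function name `global_order` and the `(kind, txn_id)` log representation are hypothetical. The sketch uses a plain topological sort and does not reproduce the network-adjustment step or the O(l) bound claimed above.

```python
from collections import defaultdict, deque


def global_order(local_logs):
    """Derive one global transaction order from several local logs.

    local_logs: list of logs, each a list of (kind, txn_id) tuples with
    kind in {"vote", "commit"}. Data access entries are never consulted.

    ASSUMED rule (not taken from the paper): within one log, a commit of
    A that appears before a vote of B means A precedes B globally.
    """
    successors = defaultdict(set)
    txns = set()
    for log in local_logs:
        committed = []  # transactions already committed earlier in this log
        for kind, txn in log:
            txns.add(txn)
            if kind == "commit":
                committed.append(txn)
            elif kind == "vote":
                for earlier in committed:
                    if earlier != txn:
                        successors[earlier].add(txn)

    # Kahn's algorithm: repeatedly emit transactions with no pending predecessor.
    indegree = {t: 0 for t in txns}
    for t in list(successors):
        for u in successors[t]:
            indegree[u] += 1
    ready = deque(sorted(t for t in txns if indegree[t] == 0))
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for u in sorted(successors[t]):
            indegree[u] -= 1
            if indegree[u] == 0:
                ready.append(u)
    if len(order) != len(txns):
        raise ValueError("local logs imply a cyclic transaction order")
    return order


# Two hypothetical local logs from a two-site system.
site_a = [("vote", "T1"), ("commit", "T1"), ("vote", "T2"), ("commit", "T2")]
site_b = [("vote", "T2"), ("vote", "T3"), ("commit", "T3"), ("commit", "T2")]
order = global_order([site_a, site_b])
```

In this example only `site_a` yields a constraint (T1 committed before T2 voted), so any returned order places T1 before T2, while T3 is unconstrained. This naive version can create a number of edges quadratic in the log length, which is one reason it should not be read as the linear-cost method the abstract describes.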
Similar resources
Incremental Data Mining Using Concurrent Online Refresh of Materialized Data Mining Views
Data mining is an iterative process. Users issue series of similar data mining queries, in each consecutive run slightly modifying either the definition of the mined dataset, or the parameters of the mining algorithm. This model of processing is most suitable for incremental mining algorithms that reuse the results of previous queries when answering a given query. Incremental mining algorithms ...
Formalizing ETL Jobs for Incremental Loading of Data Warehouses
Extract-transform-load (ETL) tools are primarily designed for data warehouse loading, i.e. to perform physical data integration. When the operational data sources happen to change, the data warehouse gets stale. To ensure data timeliness, the data warehouse is refreshed on a periodical basis. The naive approach of simply reloading the data warehouse is obviously inefficient. Typically, only a s...
Incremental Load in a Data Warehousing Environment
Incremental load is an important factor for successful data warehousing. Lack of standardized incremental refresh methodologies can lead to poor analytical results, which can be unacceptable to an organization’s analytical community. Successful data warehouse implementation depends on consistent metadata as well as incremental data load techniques. If consistent load timestamps are maintained a...
Speeding Up Incremental View Maintenance Using the Cuckoo Algorithm
Data warehouse is a repository of integrated data that is collected from various sources. Data warehouse has a capability of maintaining data from various sources in its view form. So, the view should be maintained and updated during changes of sources. Since the increase in updates may cause costly overhead, it is necessary to update views with high accuracy. Optimal Delta Evaluation method is...
Incremental View Maintenance: An Algorithmic Approach
To maintain the materialized view is one of the crucial tasks in a warehousing environment. The results of incremental computation are affected by interfering updates and compensation is required. The conventional approaches used incremental algorithm causes some anomalies. To solve such anomalies we proposed the novel approach an incremental view maintenance approach by using some existing app...
Publication date: 1997